Influence of the mating design on the additive genetic variance in plant breeding populations

Key message Mating designs determine the realized additive genetic variance in a population sample. Deflated or inflated variances can lead to reduced or overly optimistic assessment of future selection gains.

Abstract The additive genetic variance $V_A$ inherent to a breeding population is a major determinant of short- and long-term genetic gain. When estimated from experimental data, it is not only the additive variances at individual loci (QTL) but also covariances between QTL pairs that contribute to estimates of $V_A$. Thus, estimates of $V_A$ depend on the genetic structure of the data source and vary between population samples. Here, we provide a theoretical framework for calculating the expectation and variance of $V_A$ from genotypic data of a given population sample. In addition, we simulated breeding populations derived from different numbers of parents (P = 2, 4, 8, 16) and crossed according to three different mating designs (disjoint, factorial and half-diallel crosses). We calculated the variance of $V_A$ and of the parameter b reflecting the covariance component in $V_A$, standardized by the genic variance. Our results show that mating designs resulting in large biparental families derived from few disjoint crosses carry a high risk of generating progenies exhibiting strong covariances between QTL pairs on different chromosomes. We discuss the consequences of the resulting deflated or inflated $V_A$ estimates for phenotypic and genome-based selection as well as for applying the usefulness criterion in selection. We show that a single round of recombination can already effectively break negative and positive covariances between QTL pairs induced by the mating design. We suggest obtaining reliable estimates of $V_A$ and its components in a population sample by applying statistical methods that differ in their treatment of QTL covariances.

Supplementary Information The online version contains supplementary material available at 10.1007/s00122-023-04447-2.
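The split of the additive variance into a genic part (variances at individual QTL) and a covariance part (QTL pairs) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the 0/2 genotype coding for doubled-haploid lines, the divisor n in the covariance, and the toy data are all assumptions made for illustration. The ratio of the covariance part to the genic part is shown only as one plausible reading of "standardized by the genic variance"; the exact definition of b is not given in this section.

```python
from itertools import combinations


def covariance(x, y):
    """Covariance with divisor n (an assumption) of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n


def decompose_additive_variance(genotypes, effects):
    """Split Var(sum_l x_l * a_l) into a genic and a covariance part.

    genotypes: one list of allele scores per individual (hypothetical 0/2 coding)
    effects:   one additive effect per QTL
    Returns (genic, covariance_part, total).
    """
    n_loci = len(effects)
    # Re-arrange the data so that each entry holds the scores of one QTL.
    columns = [[ind[l] for ind in genotypes] for l in range(n_loci)]
    # Genic part: squared effects times the variance at each single QTL.
    genic = sum(
        effects[l] ** 2 * covariance(columns[l], columns[l]) for l in range(n_loci)
    )
    # Covariance part: every unordered QTL pair contributes twice.
    cov_part = sum(
        2 * effects[l] * effects[m] * covariance(columns[l], columns[m])
        for l, m in combinations(range(n_loci), 2)
    )
    return genic, cov_part, genic + cov_part


# Toy sample: four lines, two QTL whose alleles are inherited together.
lines = [[0, 0], [2, 2], [0, 0], [2, 2]]
genic, cov_part, total = decompose_additive_variance(lines, [1.0, 1.0])
ratio = cov_part / genic  # hypothetical stand-in for a standardized covariance term
```

In this toy sample, effects of equal sign inflate the total above the genic part, while effects of opposite sign (for example `[1.0, -1.0]`) cancel it completely, which mirrors the inflation and deflation discussed in the abstract.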


Table S1 Analysis of molecular variance (AMOVA) based on Euclidean distances between gametes from both ancestral populations Elite and Landrace. All variance components were significant (10,000 permutations, P < 0.05).

Source of variation Degrees of freedom Variance component
Between

Figure S1 Histograms of | , | + | , | + | , and | for 10,000 samples of allele effects ∼, for L = 1000 QTL using the disjoint cross (DC) mating design for P ∈ {2, 4} parental lines sampled from ancestral population Elite and N = 1000 genotypes for producing one replication of generation G1-DH. The values in the window give the mean, variance, skewness and kurtosis of the 10,000 realized values for each parameter. The values in parentheses give the conditional mean and variance calculated for this replication and set of QTL using Eqs. 6, 7, 8, 9 and 10.

Figure S2 Histograms of | = | + | , | and | (A) and | , | , and | (B) for 10,000 samples of allele effects ∼ , for L = 1000 QTL using the disjoint cross (DC) mating design for P ∈ {2, 4} parental lines sampled once from ancestral population Elite and N = 1000 genotypes for producing one replication of generation G1-DH. The values in the window give the mean, variance, skewness and the percentage of values with a negative sign of the 10,000 realized values for each parameter.

Figure S3 Density estimation of the minor allele frequencies at the 2,500 potential QTL positions in ancestral population Elite (grey) and Landrace (yellow).

Figure S4 Estimated variance of decomposed into the part attributable to the genic variance (magenta), the part attributable to covariances between QTL pairs on different chromosomes (yellow) and the part attributable to covariances between QTL pairs on the same chromosome (green) in generation G1-DH for different numbers of parental lines P ∈ {2, 4, 8, 16} sampled from ancestral population Elite (A) and Landrace (B), using three mating designs (disjoint cross (DC), factorial cross (FC), and half-diallel cross (HC)) for producing generation G1, N = 1000 genotypes for producing G1-DH, and L = 1000 QTL. The number of crosses generated in the respective mating design is shown above the bars.

Figure S6 Decay of the estimated variance of from generation G1-DH to G4-DH for varying numbers of parental lines P ∈ {2, 4, 8, 16} sampled from ancestral population Elite using the mating design factorial cross (FC) with N ∈ {50, 250, 1000} genotypes for producing generations G1 to G4 and G1-DH to G4-DH with L = 1000 QTL. The circles show the variance of estimated in the simulations, the asterisks show the expected decay based on the value in G1-DH for a population with N = ∞, and the triangles show the decay estimated with the non-linear regression with Eq. 19.

Figure S7 Probability mass function P[Y = y] (A) and cumulative distribution function of Y (B) for L = 1000 QTL, where Y is the random variable referring to the number of QTL pairs for which the product of the allele effects is positive minus the number of QTL pairs for which it is negative, assuming that the probability of a positive effect of the reference allele is 0.5. The events with Y < 0 and Y ≥ 0 are colored in orange and blue, respectively.
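The distribution of Y described in this caption can be reproduced with a short Python sketch, offered here as an illustration under stated assumptions rather than as the authors' procedure. It assumes that the signs of the L allele effects are independent, that no effect is exactly zero, and that pairs are counted as unordered pairs. If k effects are positive, then C(k,2) + C(L-k,2) pairs have a positive product and k(L-k) pairs have a negative one, so Y = ((L - 2k)^2 - L) / 2 with k binomially distributed. The function names and the parameter name `p_positive` are hypothetical.

```python
from math import comb


def pmf_pair_sign_difference(n_qtl, p_positive):
    """Return {y: P[Y = y]} for the pair-sign difference Y.

    Assumes independent effect signs; k ~ Binomial(n_qtl, p_positive)
    is the number of QTL with a positive effect.
    """
    pmf = {}
    for k in range(n_qtl + 1):
        prob_k = comb(n_qtl, k) * p_positive ** k * (1 - p_positive) ** (n_qtl - k)
        d = n_qtl - 2 * k
        # (d*d - n_qtl) is always even because d and n_qtl share their parity.
        y = (d * d - n_qtl) // 2
        pmf[y] = pmf.get(y, 0.0) + prob_k
    return pmf


def prob_negative(n_qtl, p_positive):
    """P[Y < 0]: negative-product pairs outnumber positive-product pairs."""
    pmf = pmf_pair_sign_difference(n_qtl, p_positive)
    return sum(prob for y, prob in pmf.items() if y < 0)


# Setting used in the caption: L = 1000 QTL and probability 0.5.
p_neg = prob_negative(1000, 0.5)
```

The exact binomial coefficient keeps the sketch simple and works for L = 1000; for much larger L the conversion of the coefficient to a floating-point number would overflow, and a log-scale computation would be the safer choice.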

Figure S8 Histograms for the off-diagonal elements dij (4 × gamete phase disequilibrium) of matrix D for pairs of 2500 QTL located on the same (W) or on different chromosomes (B), as well as for all elements in W & B for ancestral populations Elite and Landrace.